Automated modeling of Chinese intonation in continuous speech
Abstract
We built and trained a model of intonation in continuous Mandarin speech based on the Stem-ML model of interacting accents. With this model, we found that we can accurately reproduce the intonation of the speaker using only one accent template for each lexical tone category. The resulting parameters are interpretable, and we find that the fitted model is consistent with linguistic expectations. Stem-ML is a phenomenological model of the muscle dynamics and planning process that controls the tension of the vocal folds. It describes the interactions between nearby tones or accents.
Similar resources
Intonation modeling of Mandarin Chinese using a superpositional approach
The intonation model is an important component in text-to-speech systems to obtain natural and expressive speech synthesis. In this paper we propose a superpositional model for Mandarin Chinese. The intonation model is composed of the syllable and the phrase component. The parameters of the model are estimated using JEMA, a training approach with many advantages related to robustness and precisi...
Integration of Intonation in Trainable Speech Synthesis
Current developments in artificial speech synthesis place more emphasis on spectral continuities and diverse prosodic effects. In recent studies, the trainable HMM-based speech synthesis method has generated a more continuous spectral structure than the unit selection method, but the pitch contour generated by the HMM-based method tends to be over-smoothed and lacks syllable variance in Chinese. In this pa...
Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition
Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...
Using Zero-Frequency Resonator to Extract Multilingual Intonation Structure
Humans use expressive intonation to convey linguistic and paralinguistic meaning, in particular using focal prominence to emphasize the focus of speech. Automatic extraction of dynamic intonation features from a speech corpus, and their representation in a continuous form, are desired in multilingual speech synthesis. This paper presents a method to extract dynamic prosodic structure ...